Automatic Transcription of Lithuanian Text Using Dictionary

Authors

  • Mantas Skripkauskas
  • Laimutis Telksnys
Abstract

A technique for transcribing Lithuanian text into phonemes for speech recognition is presented. The text-to-phoneme transformation is performed using formal rules and a dictionary. The formal rules define the relationship between segments of the text and units of formalized speech sounds (phonemes); the dictionary is used to correct the transcription and to specify the stress mark and its position. The proposed automatic transcription technique was tested by comparing its results with manually obtained transcriptions. The experiment showed that fewer than 6% of the transcribed words did not match.
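The combination of formal rules and a dictionary described in the abstract can be illustrated with a minimal sketch. It assumes a dictionary-first lookup with a rule-based fallback, which the abstract does not confirm as the authors' actual control flow. Every name here (`RULES`, `DICTIONARY`, `transcribe`, `mismatch_rate`), the three sample rules, the single dictionary entry, and the stress placement in it are hypothetical placeholders, not data or results from the paper.

```python
# Illustrative sketch only: the rules, the dictionary entry and the stress
# placement are placeholders, not the rule set or dictionary from the paper.

# (text segment, phoneme) pairs; sorted so longer segments are tried first.
RULES = sorted(
    [("š", "ʃ"), ("ž", "ʒ"), ("ch", "x")],
    key=lambda rule: -len(rule[0]),
)

# word -> corrected transcription; "'" marks the (placeholder) stressed phoneme.
DICTIONARY = {
    "šachas": ["ʃ", "'a", "x", "a", "s"],
}


def transcribe_by_rules(word):
    """Greedy left-to-right rewrite; letters without a rule map to themselves."""
    phonemes = []
    i = 0
    while i < len(word):
        for segment, phoneme in RULES:
            if word.startswith(segment, i):
                phonemes.append(phoneme)
                i += len(segment)
                break
        else:
            # No rule matched at position i: copy the letter unchanged.
            phonemes.append(word[i])
            i += 1
    return phonemes


def transcribe(word):
    """Prefer the dictionary entry (with stress); fall back to the rules."""
    word = word.lower()
    if word in DICTIONARY:
        return list(DICTIONARY[word])
    return transcribe_by_rules(word)


def mismatch_rate(words, manual_transcriptions):
    """Fraction of words whose automatic transcription differs from the manual one."""
    mismatches = sum(
        transcribe(word) != manual
        for word, manual in zip(words, manual_transcriptions)
    )
    return mismatches / len(words)
```

The `mismatch_rate` helper mirrors the word-level comparison with manual transcriptions mentioned in the abstract: a word counts as a mismatch if its phoneme sequence differs from the manual one in any position.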


Similar resources

Automatic Stressing of the Lithuanian Text on the Basis of a Dictionary

The paper deals with one of the components of text-to-speech synthesis of the Lithuanian language, namely automatic text stressing. The present work substantiates the necessity to divide words into fixed and variable parts used to build different grammatical forms, as well as to store only those parts rather than the whole words in the dictionary. According to the inflexion method, all words...


Automatic Stressing of the Lithuanian Nouns and Adjectives on the Basis of Rules

The paper deals with automatic stressing of Lithuanian text. In previous work, the author presented an algorithm for automatic stressing of Lithuanian text on the basis of a dictionary. The aim of the present work is to improve the above-mentioned algorithm by including formal stressing rules for nouns and adjectives. By means of these rules such words as diminutives, names and degre...


Automatic Lemmatisation of Lithuanian MWEs

This article presents a study of lemmatisation of flexible multiword expressions in Lithuanian. An approach based on syntactic analysis designed for multiword term lemmatisation was adapted for a broader range of MWEs taken from the Dictionary of Lithuanian Nominal Phrases. In the present analysis, the main lemmatisation errors are identified and some improvements are proposed. It shows that au...


Transcribing of the Lithuanian Text Using Formal Rules

This paper deals with one of the components of text-to-speech synthesis of the Lithuanian language, namely text transcription. The formal-rules method is used for text transcription. In this work the suitability of this method is justified, the appropriate form of the rules is analyzed, and the set of rules and the interpreting algorithm are presented. Contextual information, features of stress, syllable ...


Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...



Journal title:
  • Informatica, Lith. Acad. Sci.

Volume: 17

Publication date: 2006